A Semantic Web Resource Protocol: XPointer and HTTP
Abstract
Semantic Web resources (that is, knowledge representation formalisms existing in a distributed hypermedia system) require different addressing and processing models and capacities than typical World Wide Web resources. We describe an approach to building a Semantic Web resource protocol, meaning a scalable, extensible logical addressing scheme and transport protocol, by using and extending existing specifications and technologies. We introduce XPointer and some infrequently used but useful features of HTTP/1.1 in order to support addressing and server-side processing of resource and subresource operations. We consider applications of the XPointer Framework for use in the Semantic Web, particularly for RDF and OWL resources and subresources. We describe two initial implementations: filtering of RSS resources by date and item range, and RDF subresource selection using RDQL. Finally, we describe a possible application to the problem of OWL imports.

1 RDF, RDFS, and OWL

RDF, RDFS, and OWL form the middle layers of the so-called Semantic Web layer cake. They are also its transitional layers: the parts beneath RDF-RDFS-OWL are part of the existing Web and are not distinctively semantic. These lower layers, namely URIs, Unicode, XML, and W3C XML Schema (WXS), are not knowledge representation (KR) formalisms, though XML and WXS in particular do contribute to the syntax and the semantics of RDF, RDFS, and OWL.

RDF, RDFS, and OWL are minimalist formalisms, defined primarily as logics independent of particular theories of the world or of the Web. At best they have a thin upper ontology for representing aspects of their own abstract syntax and semantics. For example, the hierarchy of built-in classes of OWL Full, which is the richest of the OWL variants in terms of expressiveness and predefined ontology, contains a universal class (rdfs:Resource), several classes of classes and of properties (rdfs:Class, owl:Class, rdf:Property, owl:AnnotationProperty, etc.), along with a few individuals (for example, rdf:nil).
Further, certain kinds of inference are licensed in RDF, RDFS, and OWL. If, for example, an RDF graph contains a triple (s p o), then there is at least one other triple, (p rdf:type rdf:Property), that is RDF-entailed by that graph. This is unusual from a first-order logic point of view: it is not merely a syntactically sensitive inference. Rather, it derives a claim about that syntax, which highlights the strong reflective modeling capabilities deemed essential to the Semantic Web.

But putting aside, first, the desirability of such reflective modeling capabilities and, second, the specific capabilities of OWL Full, there are two ways in which the thin upper ontology formed by RDFS and OWL Full, together with the associated inference mechanisms, fails to cover existing, critical aspects of the Web and the Semantic Web:

1. There are no classes or properties to represent RDF graphs or documents themselves. RDF graphs are a fundamental kind of Semantic Web resource.
2. There is no connection between the defined semantics of URIs and the assertions, especially type assertions, using them. For example, it is not unreasonable, and it has sometimes been proposed, that 'mailto:' URIs only denote things that are mailboxes or mail address endpoints.

Since there appears to be no consensus, for specific sets of URIs, as to their proper associated type, standardization of such inferences would be premature. However, it is important to be able to coin sensible URIs easily for mailboxes, Web pages, people, and RDF graphs. Taking just the case of RDF graphs, it is simple enough to offer some sensible principles to resource publishers and URI owners. For example, treat RDF graphs as normal Web-accessible information resources. The simplest application of this principle is to publish RDF graphs as RDF/XML documents on the Web, allowing others to interact with those graphs using HTTP.
This deployment practice will suffice if two conditions are met: first, that the graphs we want to manipulate directly are relatively small; and, second, that we can anticipate the subgraphs users will need and provide the appropriate document interface to them.

Footnote 5: Of course, in OWL Full all classes and properties are instances of rdfs:Resource and, thus, individuals in the domain as well.
Footnote 6: The canonical discussion of URI ownership is [10], especially section 2.2.1.1.
Footnote 7: Or, as [10] puts it, "A URI owner may, upon request, provide representations of the resource identified by the URI. For example, when a URI owner uses the HTTP protocol to provide those representations, the HTTP origin server...is the software agent acting on behalf of the URI owner to provide the authoritative representations for the resource identified by that URI. The owner is also responsible for accepting or rejecting requests to modify the resource identified by that URI..."

When RDF graphs or OWL ontologies are very large, retrieving the entire graph as a document to be processed locally by the client is often impractical, especially if the graph is dynamic or if the client is resource-constrained. Further, it is not at all clear that standard techniques for mapping resource representations to documents fit Semantic Web resource representations very well. For many applications RDF data has an obvious document chunking: for example, individual Web resource metadata can be embedded in or associated with the corresponding resource. Even so, it can be difficult to anticipate the range of kinds of queries a client may need to execute, and many common queries may cut across the obvious or common-case RDF chunking. While it is natural to associate creator information with created resources, it is perfectly sensible to want to know all the creators for a site, all the resources created by the current page's creators, and so on.
In such cases, having to aggregate the chunked RDF is worse than simply downloading a very large RDF/XML document. These problems and constraints point to the need for a logical, extensible means of identifying and addressing Semantic Web resources, preferably in a way that is as consonant as possible with existing standards and deployed systems.

Footnote 8: For example, the National Cancer Institute's Cancer Ontology, an OWL version of the NCI Thesaurus, comprises more than 500,000 RDF triples and 17,000 concepts and is roughly 35 megabytes. Or consider the RDF/XML serialization of the UniProt (Universal Protein Resource) KB, which is about 134,000,000 triples and 8 gigabytes.

2 Web Addressing and Fragment Identifiers

According to [2], in order to authoritatively interpret a URI fragment identifier, a user-agent must dereference the URI containing the fragment identifier, using the Internet Media Type (IMT) [7] of the retrieved representation to apply an authoritative interpretation function to the fragment identifier. When dereferencing URIs that return HTML representations, the user-agent is usually acting on behalf of a person who is navigating web hypermedia resources. In this case the semantics of the fragment identifier are interpreted by a web browser such that the subresource identified by the fragment identifier is made visible in the browser window. Fragment identifiers do not allow user-agents to manipulate subresources directly; they are, rather, processed by the user-agent only after the resource representation has been retrieved.

Footnote 9: A fragment identifier, according to [10], allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations.
Footnote 10: See [10], section 3.3.1.
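The client-side handling of fragment identifiers described above can be sketched as follows. This is only a minimal illustration using Python's standard urllib.parse module; the function name split_for_dereference and the example.org URI are hypothetical and do not come from the paper.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_for_dereference(uri: str) -> Tuple[str, str]:
    """Split a URI into the part that is dereferenced and the fragment.

    The first element is what a user-agent would request from the server;
    the second is the fragment identifier, which the user-agent keeps and
    interprets locally once a representation has been retrieved.
    """
    base, fragment = urldefrag(uri)
    return base, fragment


# Hypothetical URI, used only for illustration.
base, fragment = split_for_dereference("http://example.org/ontology.rdf#Person")
print(base)      # http://example.org/ontology.rdf
print(fragment)  # Person
```

How the fragment is then interpreted depends on the media type of the retrieved representation, which this sketch does not attempt to model.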
In fact, HTTP does not even transmit the fragment identifier as part of the Request-URI, so origin servers never see the fragment identifier component of the URI. HTTP does, however, provide an extensible mechanism for user-agents to interact with a range of the resource representation: the Range header. The only range-unit that is explicitly described by the HTTP/1.1 RFC is "bytes". User-agents may address one or more byte ranges in the retrieved resource representation. The bytes range-unit is useful for some IMTs, for example when retrieving a byte range of a URI that returns a binary image as its representation. The HTTP specification breaks transparency and encourages caches to combine byte ranges under certain validity conditions. The bytes range-unit is obviously much less useful for Semantic Web resources because it is not appropriate for directly addressing and manipulating logical or semantic subresources.
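To make the Range mechanism concrete, the sketch below builds Range header values. The function names are hypothetical. The "bytes" form follows the HTTP/1.1 byte-range syntax, while the second function assumes an application-defined range-unit (here called "xpointer") that HTTP/1.1 itself does not define; it is included only to illustrate where the extensibility lies, not as the protocol the paper specifies.

```python
from typing import List, Optional, Tuple

# (first-byte-pos, last-byte-pos); None marks an open end.
ByteRange = Tuple[Optional[int], Optional[int]]


def format_byte_ranges(ranges: List[ByteRange]) -> str:
    """Build a Range header value using the HTTP/1.1 "bytes" range-unit."""
    if not ranges:
        raise ValueError("at least one range is required")
    specs = []
    for first, last in ranges:
        if first is None and last is None:
            raise ValueError("a range needs a start, an end, or both")
        if first is None:
            # Suffix range: the final `last` bytes of the representation.
            specs.append(f"-{last}")
        elif last is None:
            # Open-ended range: from `first` to the end of the representation.
            specs.append(f"{first}-")
        else:
            if last < first:
                raise ValueError("last byte position precedes first")
            specs.append(f"{first}-{last}")
    return "bytes=" + ",".join(specs)


def format_other_range(unit: str, range_set: str) -> str:
    """ASSUMPTION: a Range value for an application-defined range-unit.

    HTTP/1.1 describes only "bytes"; any other unit must be agreed on by
    client and server, and the range-set is opaque to this helper.
    """
    return f"{unit}={range_set}"


print(format_byte_ranges([(0, 499), (1000, None)]))  # bytes=0-499,1000-
# Hypothetical unit and range-set, for illustration only.
print(format_other_range("xpointer", "element(/1/2)"))
```

A user-agent would send the resulting string as the value of the Range request header. A server that does not understand the unit can ignore the header and return the full representation.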
Publication date: 2004